Efficient Lossless Compression of Trees and Graphs

Authors

  • Shenfeng Chen
  • John H. Reif
Abstract

Data compression algorithms are widely used to meet the demand for storing and transferring large data. Most of them treat the input as a sequence of binary numbers and represent the compressed data as a binary sequence as well. However, in many areas, such as programming languages (e.g. LISP and C) and compiler design, it is more desirable to have a compression algorithm that compresses a data structure while keeping a structure similar to the original in the compressed form. In addition to reducing storage space, such compression allows various operations (e.g. searching) to be executed efficiently on the compressed form. In this paper, we study the problem of compressing a non-binary data structure (e.g. trees, undirected and directed graphs) efficiently while keeping a similar structure in the compressed form. To date, there has been no proven optimal algorithm for this problem. We use the idea of building an LZW tree in LZW compression to compress a binary tree generated by a stationary ergodic source. The tree is parsed into subtrees using breadth-first search, and an optimal dictionary is constructed with each index pointing to a distinct subtree. We replace the parsed subtrees with dictionary indices to form the compressed tree. We also extend our tree compression algorithm to compress undirected and directed acyclic graphs.

This work was supported by NSF Grant NSF-IRI-91-00681, Rome Labs Contracts F30602-94G0037, ARPA/SISTO contracts N0001491-J-1985, and N0001492-C-0182 under subcontract KI-92-01-0182.
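
The parsing idea in the abstract (breadth-first search, a dictionary of distinct subtrees, indices replacing parsed subtrees) can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a simplified LZ78-style variant, chosen because the abstract does not give the details of the LZW construction. It assumes an unlabeled binary tree stored as nested two-element lists `[left, right]` with `None` for a missing child. The names `shape`, `parse_tree`, `rebuild_tree`, and `full_tree` are hypothetical.

```python
from collections import deque

def shape(node):
    """Child pattern of a node: 0 none, 1 left only, 2 right only, 3 both."""
    left, right = node
    return (1 if left is not None else 0) + (2 if right is not None else 0)

def parse_tree(root):
    """Split a tree into phrases; return (prefix_index, symbol) pairs."""
    trie = {}                    # (prefix_index, symbol) -> phrase index; 0 = empty phrase
    output = []
    pending = deque([root])      # roots of subtrees not yet covered by any phrase
    while pending:
        local = deque([pending.popleft()])   # BFS frontier inside the current phrase
        prefix = 0
        while True:
            if not local:        # subtree ended while still matching a known phrase
                output.append((prefix, None))
                break
            node = local.popleft()
            sym = shape(node)
            local.extend(child for child in node if child is not None)
            if (prefix, sym) in trie:
                prefix = trie[(prefix, sym)]          # keep growing the match
            else:
                trie[(prefix, sym)] = len(trie) + 1   # new dictionary entry
                output.append((prefix, sym))
                break
        pending.extend(local)    # uncovered nodes start later phrases
    return output

def rebuild_tree(pairs):
    """Inverse of parse_tree: replay the phrases in the same BFS order."""
    phrases = {0: []}            # phrase index -> list of symbols in BFS order
    root = [None, None]
    pending = deque([root])
    for prefix, sym in pairs:
        symbols = phrases[prefix] + ([] if sym is None else [sym])
        if sym is not None:
            phrases[len(phrases)] = symbols
        local = deque([pending.popleft()])
        for s in symbols:
            node = local.popleft()
            if s & 1:
                node[0] = [None, None]
                local.append(node[0])
            if s & 2:
                node[1] = [None, None]
                local.append(node[1])
        pending.extend(local)
    return root

def full_tree(depth):
    """Complete binary tree with the given number of levels (illustration only)."""
    if depth == 0:
        return None
    return [full_tree(depth - 1), full_tree(depth - 1)]

tree = full_tree(3)
pairs = parse_tree(tree)
print(pairs)                     # [(0, 3), (1, 0), (2, 0), (0, 0)]
print(rebuild_tree(pairs) == tree)
```

For the seven-node complete tree, the sketch emits four phrases. Each pair names an earlier dictionary entry plus the one node that extends it, so repeated subtree shapes are covered by progressively larger phrases.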

Similar resources

Lossless Microarray Image Compression by Hardware Array Compactor

Microarray technology is a new and powerful tool for concurrently monitoring the expression of a large number of genes. Each microarray experiment produces hundreds of images, and each digital image requires a large amount of storage space. Hence, real-time processing and transmission of these images necessitate efficient, custom-made lossless compression schemes. In this paper, we offer a new archi...

Efficient Lossless Compression of Trees and Graphs

In this paper, we study the problem of compressing a data structure (e.g. trees, undirected and directed graphs) in an efficient way while keeping a similar structure in the compressed form. To date, there has been no proven optimal algorithm for this problem. We use the idea of building an LZW tree in LZW compression to compress a binary tree generated by a stationary ergodic source in an optimal ma...

Toward Remote Object Coherence with Compiled Object Serialization for Distributed Computing with XML Web Services

Cross-platform object-level coherence in Web services-based distributed systems and grids requires lossless serialization to ensure programming-language specific objects are safely transmitted, manipulated, and stored. However, Web services development tools often suffer from lossy forms of XML serialization, which diminishes the usefulness of XML Web services as a competitive approach to binar...

Lossless Compression of Chemical Fingerprints Using Integer Entropy Codes Improves Storage and Retrieval

Many modern chemoinformatics systems for small molecules rely on large fingerprint vector representations, where the components of the vector record the presence or number of occurrences in the molecular graphs of particular combinatorial features, such as labeled paths or labeled trees. These large fingerprint vectors are often compressed to much shorter fingerprint vectors using a lossy compr...

Fast Algorithm for Optimal Compression of Graphs

We consider the problem of finding an optimal description for general unlabeled graphs. Given a probability distribution on labeled graphs, we introduced in [4] a structural entropy as a lower bound for the lossless compression of such graphs. Specifically, we proved that the structural entropy for the Erdős–Rényi random graph, in which edges are added with probability p, is $\binom{n}{2} h(p) - n \log n + O($...

A Scalable to Lossless Audio Compression Scheme

This paper outlines a scalable-to-lossless coder, that is, a coder that scales from lossy quality to lossless quality. Lossless compression is achieved by concatenating a lossy, scalable transform coder with a scalable scheme for compressing the synthesis error signal. The lossless compression results obtained are comparable with the state of the art in los...

Publication date: 1996